Abstract. We consider a special case of the generalized minimum spanning
tree problem (GMST) and the generalized travelling salesman problem
(GTSP) in which we are given a set of points inside the integer grid
(in the Euclidean plane), where each grid cell is 1 x 1. In the MST version
of the problem, the goal is to find a minimum weight tree that contains
exactly one point from each non-empty grid cell (cluster). Similarly, in the
TSP version of the problem, the goal is to find a minimum weight cycle
containing one point from each non-empty grid cell. We give a
$(1 + 4\sqrt{2} + \epsilon)$- and a $(1.5 + 8\sqrt{2} + \epsilon)$-approximation
algorithm for these two problems in the described setting, respectively.

Our motivation is based on the problem posed in [7] for a constant
approximation algorithm. The authors designed a PTAS for the more special
case of the GMST where non-empty cells are connected and dense
enough. However, their algorithm relies heavily on this connectivity
restriction and is impractical. Our results develop the topic further.

Keywords: generalized minimum spanning tree, generalized travelling 
salesman, grid clusters, approximation algorithm. 


1 Introduction 

The generalized minimum spanning tree problem (GMST) is a generalization
of the well known minimum spanning tree problem (MST). An instance of the
GMST is given by an undirected graph $G = (V, E)$ where the vertex set is
partitioned into $k$ clusters $V_i$, $i = 1, \ldots, k$, and a weight
$w(e) \in \mathbb{R}^+$ is assigned to every edge $e \in E$. The goal is to
find a tree with minimum weight containing one vertex from each cluster.

The GMST occurs in telecommunications network planning, where a network
of node clusters needs to be connected via a tree architecture using exactly
one node per cluster [?]. More precisely, local subnetworks must be
interconnected by a global network containing a gateway from each subnetwork.
For this internetworking, a point has to be chosen in each local network as a
hub, and the hub points must be connected via transmission links such as
optical fiber, see [16]. Furthermore, the GMST has applications in the design
of backbones in large communication networks, energy distribution, and
agricultural irrigation [12].

The GMST was first introduced by Myung, Lee and Tcha in 1995 [16].
Although the MST is polynomially solvable [8], it was shown in [16] that the
GMST is strongly NP-hard and that there is no constant factor approximation
algorithm, unless P=NP. However, several heuristic algorithms have been
suggested for the GMST, see [11,12,18,19]. Furthermore, Pop, Still and Kern
[20] used an LP-relaxation to develop a $2p$-approximation algorithm for the
GMST where the size of every cluster is bounded by $p$.

In [7], Feremans, Grigoriev and Sitters consider the geometric generalized
minimum spanning tree problem in grid clusters, GGMST for short. In this
special case of the GMST, a complete graph $G = (V, E)$ is given where the set
of vertices $V$ corresponds to a set of points in the planar integer grid.
Every non-empty 1 x 1 cell of the grid forms a cluster. The weight of the edge
between two vertices is given by their Euclidean distance. Fig. 1 depicts one
instance of the GGMST. We say that two grid cells are connected if they share
a side or a corner. Furthermore, we say that a set of grid cells is connected
if they form one connected component. The authors in [7] show that the GGMST
is strongly NP-hard, even if we restrict to instances in which non-empty grid
cells are connected and each grid cell contains at most two points.
Furthermore, they designed a dynamic programming algorithm that solves the
GGMST for instances in which the set of non-empty grid cells is connected and
fits into a $k \times l$ sub-grid; its running time is polynomial if $k$ is
bounded. Moreover, the authors used this algorithm to develop a polynomial
time approximation scheme (PTAS) for the GGMST for which non-empty cells are
connected and the number of non-empty cells is superlinear in $k$ and $l$.
The GGMST instances are often used to test heuristics for the GMST which, in
light of the results in [7], is not adequate. The objective of this paper is
to develop this topic further and to design simple approximation algorithms
for the GGMST and its variants without restricting only to connected and
dense instances.

Fig. 1. A GGMST instance with $n = 21$ points and $N + 1 = 8$ non-empty cells,
which are connected and fit into a 3 x 5 sub-grid

Analogously to the GMST and the GGMST, the generalized travelling salesman
problem (GTSP) and the geometric generalized travelling salesman problem
in grid clusters (GGTSP) can be defined. The GTSP was introduced by
Henry-Labordere [?] and is also known in the literature as set TSP, group TSP
or One-of-a-Set TSP. This problem has many applications, including airplane
routing, computer file sequencing, and postal delivery, see [?]. Elbassioni,
Fishkin, Mustafa and Sitters [5] considered the GTSP in which non-empty
clusters (i.e. regions) are disjoint $\alpha$-fat objects with possibly
varying size. In this setting they obtained a $(9.1\alpha + 1)$-approximation
algorithm. They also give the first $O(1)$-approximation algorithm for the
problem with intersecting clusters (regions). Note that in the GGTSP, the
fatness of each cluster is 4 (each cluster is a square).

As a special case of the GTSP we can look at each geometric region as an
infinite set of points. This problem, called the TSP with neighbourhoods, was
introduced by Arkin and Hassin [1]. In the same paper they present constant
factor approximation algorithms for two cases: regions that are translates
of disjoint convex polygons, and disjoint unit disks. For the general
problem, Mata and Mitchell [?] and later Gudmundsson and Levcopoulos [?]
gave an $O(\log n)$-approximation algorithm. For intersecting unit disks an
$O(1)$-approximation algorithm is given in [?]. Safra and Schwartz [21] show
that it is NP-hard to approximate the TSP with neighbourhoods within
$(2 - \epsilon)$. In this context, it is natural to consider the GTSP in
which points are sitting inside geometric objects such as the integer grid.

Notation. We will usually refer to vertices as points. Throughout this paper,
the number of points ($|P|$) will be denoted by $n$. Furthermore, $N$ denotes
the number of edges in every feasible solution (tree) of the GGMST, i.e. $N$
is the number of non-empty cells minus 1. The edge between two points $u$ and
$v$ will be denoted by $e_{u,v}$. We naturally extend the notation for the
weight to sets of edges and graphs, i.e. the weight of a tree $T$ is denoted
by $w(T) = \sum_{e \in T} w(e)$, where $e \in T$ means that $e$ is an edge of
$T$. We assume that every point is in just one cell, i.e. points on the cell
borders are assigned to only one neighbouring cell by any rule. An optimal
solution of the GGMST will be denoted by $T_{opt}$ throughout this paper.
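As a small illustration of this notation, the following Python sketch groups points into 1 x 1 cells and reads off $n$ and $N$. The function name `group_into_cells` and the border rule (a border point goes to the cell given by the floor of its coordinates) are assumptions made for the example; the text allows any assignment rule.

```python
import math
from collections import defaultdict

def group_into_cells(points):
    """Map each point to the 1 x 1 grid cell that contains it.

    Assumed border rule (the paper allows any rule): a point is assigned
    to the cell whose lower-left corner is (floor(x), floor(y)).
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(math.floor(x), math.floor(y))].append((x, y))
    return dict(cells)

points = [(0.2, 0.7), (0.9, 0.1), (1.5, 0.5), (3.0, 2.0)]
cells = group_into_cells(points)
n = len(points)      # number of points
N = len(cells) - 1   # number of edges in every feasible GGMST tree
```

Here three cells are non-empty, so every feasible tree has $N = 2$ edges.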

Our results and organization of the paper. The main result of this paper
is a $(1 + 4\sqrt{2} + \epsilon)$-approximation algorithm for the GGMST. We do
not assume any restrictions on connectivity, density or cardinality of
non-empty cells. The algorithm is presented and analyzed in Section 2. A lower
bound for the weight of an optimal solution in terms of $N$ is used to prove
the approximation ratio of the algorithm. Section 3 is devoted to proving this
lower bound. Lastly, in Section 4 we use our GGMST algorithm to develop
approximation algorithms for the GGTSP.

2 The GGMST Approximation Algorithm 

In this section we present a $(1 + 4\sqrt{2} + \epsilon)$-approximation
algorithm (Algorithm 3) for the GGMST. The main part of the algorithm is
Algorithm 1, which we describe next.
Algorithm 1: $\left(1 + 4\sqrt{2} + \frac{2\sqrt{2}}{w(T_{opt})}\right)$-approximation alg. for the GGMST

  T <- solution of the MST problem on non-empty cells (where the distance
       between a pair of cells is the length of the shortest edge between them);
  G <- the graph consisting of the set of edges (and points) that correspond
       to the edges in T;
  for all cells C that contain more than one point from G do
      C_G <- the set of points from G that are in C;
      p <- point from C_G that is a median for C_G;
      Replace C_G by p, i.e. reconnect to p all edges of G that enter C;
  end
  return G;


Algorithm 1 is divided into two parts. In the first part we solve an MST
instance defined as follows: non-empty cells play the role of vertices, and
the weight of the edge between two cells $C_1, C_2$ is the smallest weight of
an edge $e_{p_1,p_2}$ where $p_1 \in C_1$ and $p_2 \in C_2$. Let $T$ be an
optimal tree of such an MST instance, and let graph $G$ be the set of edges
(with their endpoints) of the original GGMST instance that correspond to the
edges of $T$. Note that $G$ has $N$ edges and spans all non-empty cells, but
it can have multiple points in some cells. In the second part of Algorithm 1
(i.e. the for loop), we modify $G$ to obtain GGMST feasibility, by iteratively
replacing multiple cell points by a single point $p$. We choose point $p$ to
be the one that has the minimum sum of distances to the other points of $G$
that are in the corresponding cell.
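The two parts described above can be sketched in Python as follows. This is a minimal, unoptimized illustration rather than the authors' implementation: the names `algorithm1`, `closest_pair` and the dictionary-based input format are assumptions, and the cell-level MST is computed with a naive version of Prim's algorithm.

```python
import math
from collections import defaultdict

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def algorithm1(cells):
    """cells: dict mapping a cell id to a non-empty list of (x, y) points.
    Returns a list of edges (p, q) that uses one point per cell."""
    ids = list(cells)
    if len(ids) < 2:
        return []

    def closest_pair(a, b):
        # Shortest edge between cells a and b: (length, point in a, point in b).
        return min((dist(p, q), p, q) for p in cells[a] for q in cells[b])

    # Part 1: MST on cells (naive Prim), remembering the realizing points.
    in_tree = {ids[0]}
    G = []  # entries are (cell_a, point_a, cell_b, point_b)
    while len(in_tree) < len(ids):
        _, p, q, a, b = min(
            (closest_pair(a, b) + (a, b)
             for a in in_tree for b in ids if b not in in_tree),
            key=lambda t: t[0],
        )
        G.append((a, p, b, q))
        in_tree.add(b)

    # Part 2: in every cell, collapse the points used by G to their median,
    # i.e. the used point with minimum sum of distances to the other used points.
    used = defaultdict(set)
    for a, p, b, q in G:
        used[a].add(p)
        used[b].add(q)
    rep = {c: min(pts, key=lambda p: sum(dist(p, r) for r in pts))
           for c, pts in used.items()}
    return [(rep[a], rep[b]) for a, _, b, _ in G]
```

For cells with a single used point the median step is a no-op, so the loop over all cells matches the for loop of Algorithm 1, which only touches cells with more than one point of $G$.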

Next we present an upper bound for solutions obtained by Algorithm 1 in
terms of the number of edges $N$.

Theorem 1. Algorithm 1 produces a feasible solution $T_A$ with
$w(T_A) \le w(T_{opt}) + \sqrt{2}N - \sqrt{2}$, where $N$ is the number of
edges of $T_A$.

Proof. Denote by $G_0$ the non-feasible graph obtained in the first part of
the algorithm, i.e. the first version of graph $G$. Then the weight of the
solution $T_A$ obtained by the algorithm is equal to $w(G_0) + ext$, where
$ext$ is the amount by which we increase (extend) the weight of $G_0$ in the
second part of the algorithm. Note that $w(G_0) \le w(T_{opt})$, as $G_0$ is
an optimal solution of the problem for which $T_{opt}$ is a feasible solution
(find a minimum weight set of edges that spans all non-empty cells, with all
GGMST edges being allowed). In the rest of the proof we will bound the value
of $ext$.

In every run of the for loop we replace the set of points $C_G$ with $p$. In
doing so, every edge $e_{q,c}$ of $G$ with $c \in C_G$ is replaced by
$e_{q,p}$. From the triangle inequality we get that
$w(e_{q,p}) - w(e_{q,c}) \le w(e_{c,p})$. Hence, the increase (extension) of
the weight of $G$ in every run of the for loop is at most
$\sum_{c \in C_G} w(e_{c,p})$. Instead of bounding such absolute values, we
will bound the average per edge adjacent to the corresponding cell. More
precisely, we will calculate an average extension per half-edge assigned to
the corresponding cell. Namely, every edge will be extended at most two times,
once on each endpoint, so we can look at each extension as an extension of a
half-edge. Furthermore, note that edges that contain leaves will be extended
only on one side. We will use this fact to assign half-edges that contain
leaves to other cells to lower their average half-edge extension. To every
cell $C$, we will assign $|C_G| - 2$ leaf half-edges. Intuitively, we can do
this because every node $v$ of a tree generates $\deg(v) - 2$ leaves.
Formally, it follows from the following well known equality:

$$|V_1| = 2 + \sum_{i > 2} |V_i| (i - 2), \qquad (1)$$

where $V_i = \{v \in V : \deg(v) = i\}$, and $V$ is the set of vertices of a
tree.
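Equality (1) follows from the fact that the degrees of a tree on $|V|$ vertices sum to $2(|V| - 1)$. The short Python check below confirms it on a randomly generated tree; the generator (attach each new vertex to a random earlier one) and the function names are assumptions chosen only for the illustration.

```python
import random
from collections import Counter

def random_tree_degrees(n, seed=0):
    """Degree sequence of a random tree on n >= 2 vertices: vertex v > 0
    is attached to a uniformly random earlier vertex."""
    rng = random.Random(seed)
    deg = [0] * n
    for v in range(1, n):
        u = rng.randrange(v)
        deg[u] += 1
        deg[v] += 1
    return deg

def leaves_from_identity(deg):
    """Right-hand side of (1): 2 + sum over i > 2 of |V_i| (i - 2)."""
    count = Counter(deg)  # count[i] = |V_i|
    return 2 + sum(c * (i - 2) for i, c in count.items() if i > 2)

deg = random_tree_degrees(50)
number_of_leaves = deg.count(1)  # |V_1|
```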

Then for a cell $C$ the average extension per assigned half-edge is bounded
above by

$$\frac{\sum_{c \in C_G} w(e_{c,p})}{|C_G| + (|C_G| - 2)}. \qquad (2)$$

Note that the maximum distance between two cell points is $\sqrt{2}$. Since
points from $C_G$ are candidates for $p$, it follows that
$\sum_{c \in C_G} w(e_{c,p}) \le \sqrt{2}(|C_G| - 1)$.

Hence, (2) is bounded above by

$$\frac{\sqrt{2}(|C_G| - 1)}{2|C_G| - 2} = \frac{\sqrt{2}}{2}.$$

Hence, on average, every half-edge (except 2 leaf half-edges, see (1)) is
extended by at most $\sqrt{2}/2$. Note that this average bound is a constant,
i.e. it does not depend on $C$. Now $ext$ can be bounded by

$$ext \le \frac{\sqrt{2}}{2}(2N - 2) = \sqrt{2}N - \sqrt{2}. \qquad (3)$$

Finally, we can bound the solution $T_A$ of the algorithm by

$$w(T_A) \le w(G_0) + ext \le w(T_{opt}) + \sqrt{2}N - \sqrt{2}.$$


□ 

The following theorem gives a lower bound for the optimal solution in terms
of the number of edges $N$. Section 3 is dedicated to proving the theorem.

Theorem 2. If $T_{opt}$ is an optimal solution of the GGMST on $N + 1$
non-empty cells, then $N \le 4w(T_{opt}) + 3$.

Now from Theorem 1 and Theorem 2 the following approximation bound for
Algorithm 1 follows.

Corollary 1. Algorithm 1 produces a feasible solution $T_A$ of the GGMST such
that $w(T_A) \le (1 + 4\sqrt{2})w(T_{opt}) + 2\sqrt{2}$.
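For completeness, the corollary is obtained by substituting the bound on $N$ from Theorem 2 into the bound of Theorem 1; written out as a worked step:

```latex
\begin{align*}
w(T_A) &\le w(T_{opt}) + \sqrt{2}\,N - \sqrt{2}
        && \text{(Theorem 1)} \\
       &\le w(T_{opt}) + \sqrt{2}\,\bigl(4\,w(T_{opt}) + 3\bigr) - \sqrt{2}
        && \text{(Theorem 2)} \\
       &= (1 + 4\sqrt{2})\,w(T_{opt}) + 2\sqrt{2}.
\end{align*}
```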
Note that, due to the constant $2\sqrt{2}$, Corollary 1 does not give us a
constant approximation ratio for Algorithm 1. Namely, the approximation ratio
that we get is equal to $1 + 4\sqrt{2} + \frac{2\sqrt{2}}{w(T_{opt})}$. Next
we focus on improving Algorithm 1 so that $\frac{2\sqrt{2}}{w(T_{opt})}$ is
replaced by an arbitrarily small $\epsilon > 0$. Note that the optimal
solution weight does not necessarily increase with the number of points $n$,
since all points can be in the same cell. Hence we cannot use the standard
approach. However, the following two facts will do the trick. First, note
that the weight of the GGMST optimal solution increases as the number of
non-empty cells increases. Second, given a spanning tree structure of
non-empty cells $T$, we can in polynomial time find the minimum weight GGMST
feasible solution $T'$ with the same tree structure as $T$ (i.e. there is an
edge in $T'$ between two cells if and only if these two cells are adjacent in
$T$). Next we design a dynamic programming algorithm to find $T'$ (see
Algorithm 2).

Given a GGMST instance, let $T$ be a spanning tree of the complete graph
whose set of vertices corresponds to the set of non-empty cells. Denote by
$X_i$ the set of points inside cell $C_i$. We observe $T$ as a rooted tree
with $C_r$ as its root. If $C_i$ is a leaf of $T$ then the weight $W(z)$ of
each point $z$ in set $X_i$ is set to zero. If $C_i$ is not a leaf then it
has some children $C_{i_1}, \ldots, C_{i_k}$, and the weights for points
inside sets $X_{i_1}, \ldots, X_{i_k}$ have already been computed. Then for
each point $p$ in cell $C_i$ (set $X_i$) we compute:

$$W(p) = \sum_{j=1}^{k} \min_{q \in X_{i_j}} \{W(q) + w(e_{p,q})\}.$$

Algorithm 2 computes $W(p)$ for all $p \in C_r$. Note that it is easy to adapt
Algorithm 2 to store the selected points at each step.
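The recurrence, together with the adaptation that stores the selected points, can be sketched in Python as below. The input format (a `points` dictionary per cell and a `children` dictionary describing the rooted tree) and the function name are assumptions for illustration; the sketch processes cells in post-order instead of level by level, which yields the same values $W(p)$.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def best_tree_for_structure(points, children, root):
    """points[c]: list of (x, y) points in cell c.
    children[c]: child cells of c in the rooted tree T (a missing key is a leaf).
    Returns (optimal weight, dict mapping each cell to its selected point)."""
    W = {}     # W[c][p]: cheapest subtree below c when p is selected in c
    pick = {}  # pick[c][p]: list of (child cell, best point in that child)

    # Pre-order listing; its reverse visits every cell after all its children.
    order, stack = [], [root]
    while stack:
        c = stack.pop()
        order.append(c)
        stack.extend(children.get(c, []))

    for c in reversed(order):
        W[c], pick[c] = {}, {}
        for p in points[c]:
            total, choice = 0.0, []
            for d in children.get(c, []):
                q = min(points[d], key=lambda r: W[d][r] + dist(p, r))
                total += W[d][q] + dist(p, q)
                choice.append((d, q))
            W[c][p], pick[c][p] = total, choice  # leaves get W = 0

    # Recover the selected point of every cell from the stored choices.
    best = min(points[root], key=lambda p: W[root][p])
    chosen, stack = {}, [(root, best)]
    while stack:
        c, p = stack.pop()
        chosen[c] = p
        stack.extend(pick[c][p])
    return W[root][best], chosen
```

Each point is compared against every point of each child cell once, so the running time is polynomial in the number of points.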

Now we have all the ingredients to design a
$(1 + 4\sqrt{2} + \epsilon)$-approximation algorithm, see Algorithm 3. Note
that $1 + 4\sqrt{2}$ is approximately equal to 6.66.

Theorem 3. For any $\epsilon > 0$, Algorithm 3 is a
$(1 + 4\sqrt{2} + \epsilon)$-approximation algorithm for the GGMST.

Proof. If $N < 15$ or $N < 10\sqrt{2}/\epsilon$, then we can enumerate all
spanning trees on $N + 1$ non-empty cells, and apply Algorithm 2 on each of
them. That will give us an optimal solution in polynomial time.

Assume $N \ge 15$ and $N \ge 10\sqrt{2}/\epsilon$. By Corollary 1 it follows
that Algorithm 1 will produce a solution $T_A$ such that

$$w(T_A) \le (1 + 4\sqrt{2}) w(T_{opt}) + 2\sqrt{2}. \qquad (4)$$

From Theorem 2 and $N \ge 15$ it follows that $1 \le 5w(T_{opt})/N$. Applying
that to the rightmost term of inequality (4) we get

$$w(T_A) \le (1 + 4\sqrt{2}) w(T_{opt}) + \frac{10\sqrt{2}}{N} w(T_{opt})
         = \left(1 + 4\sqrt{2} + \frac{10\sqrt{2}}{N}\right) w(T_{opt}).$$
Algorithm 2: Optimal GGMST solution for a given spanning tree of cells

  Data: A spanning tree T of non-empty cells
  Result: The optimal weight of a GGMST tree with the same structure as T
  Choose an arbitrary cell C_r as the root of T;
  for each leaf C_i of T do
      for each p in X_i do
          W(p) = 0;
      end
  end
  CurrentLevel = height of T;
  while CurrentLevel >= root level do
      for each node C_i of CurrentLevel do
          Let C_{i_1}, ..., C_{i_k} be the children of C_i in T;
          for each p in X_i do
              W(p) = \sum_{j=1}^{k} \min_{q \in X_{i_j}} {W(q) + w(e_{p,q})};
          end
      end
      CurrentLevel = CurrentLevel - 1;
  end
  return \min_{p \in X_r} W(p);

Algorithm 3: $(1 + 4\sqrt{2} + \epsilon)$-approximation algorithm for the GGMST

  if N < 15 or N < 10\sqrt{2}/\epsilon then
      Output the minimum weight solution obtained by Algorithm 2 on all
      spanning trees of non-empty cells;
  else
      Run Algorithm 1;
  end

Now from $N \ge 10\sqrt{2}/\epsilon$ it follows that

$$w(T_A) \le (1 + 4\sqrt{2} + \epsilon) w(T_{opt}),$$

which proves the theorem. □
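The two thresholds used in the proof can be traced explicitly. The following worked step shows where the conditions $N \ge 15$ and $N \ge 10\sqrt{2}/\epsilon$ enter:

```latex
\begin{align*}
N \le 4\,w(T_{opt}) + 3
  &\;\Longrightarrow\; w(T_{opt}) \ge \frac{N-3}{4} \ge \frac{N}{5}
  && \text{since } 5(N-3) \ge 4N \iff N \ge 15, \\
2\sqrt{2}
  &\le 2\sqrt{2}\cdot\frac{5\,w(T_{opt})}{N}
   = \frac{10\sqrt{2}}{N}\,w(T_{opt})
   \le \epsilon\,w(T_{opt})
  && \text{since } N \ge \frac{10\sqrt{2}}{\epsilon}.
\end{align*}
```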

3 The Lower Bound Proof 

This section is entirely devoted to proving Theorem 2, which gives us a lower
bound on the weight of an optimal solution. The lower bound is expressed in
terms of the number of edges $N$.

Throughout this section we identify a 1 x 1 grid cell with its coordinates
$(i,j)$, where $i, j \in \mathbb{Z}$ are the row and the column of the cell
inside the infinite integer grid. For example, in Fig. 1 cell $(i, j+1)$
contains one point which is near its upper right corner.
We start by proving lower bounds for trees of small size.

Lemma 1. The weight of any subtree of $T_{opt}$ with four edges is at least 1.

Proof. Consider a subtree $T'$ of $T_{opt}$ with four edges. Let $H$ denote
the set of the five cells that contain vertices of $T'$. Note that there will
be two cells in $H$ with coordinates $(i,j)$ and $(i',j')$ such that
$|i - i'| \ge 2$ or $|j - j'| \ge 2$. Hence, the Euclidean distance between a
vertex from the cell $(i,j)$ and a vertex from the cell $(i',j')$ is at least
1. This implies $w(T') \ge 1$. See Fig. 2 for an example. □


Fig. 2. An example of a tree $T'$ with four edges


Lemma 2. The weight of any subtree of $T_{opt}$ with seven edges is at least
$\frac{1}{3}\left(2\sqrt{6} + \sqrt{6 - 3\sqrt{3}}\right)$ (which is greater
than 1.93).

Proof. Let $T'$ be a subtree of $T_{opt}$ with seven edges. If $T'$ does not
fit in any 3 x 3 sub-grid of the original grid, then there are two vertices
$u, v$ of $T'$ which are from cells with coordinates $(i,j)$ and $(i',j')$
such that $|i - i'| \ge 3$ or $|j - j'| \ge 3$. In that case
$w(e_{u,v}) \ge 2$ and therefore $w(T') \ge 2$.

Next we consider the case when $T'$ fits into a 3 x 3 sub-grid. Since $T'$ has
eight vertices, at least three of them are in the corner cells of the 3 x 3
grid. Without loss of generality we assume that these three vertices are
vertex $v$ in cell $(i,j)$, vertex $u$ in cell $(i+2, j)$ and vertex $y$ in
cell $(i, j+2)$. Let $P$ be the shortest path in $T'$ from $v$ to $u$ and let
$Q$ be the shortest path in $T'$ from $v$ to $y$. Note that
$w(e_{v,u}) \ge 1$ and $w(e_{v,y}) \ge 1$. If $P$ and $Q$ do not have a common
vertex apart from $v$, then $w(T') \ge 2$. Thus we are left with the case when
$P$ and $Q$ have a common vertex other than $v$, which we denote by $x$.

First we assume that $P$ and $Q$ do not go through the point in cell
$(i+1, j+1)$. In this case, up to symmetry, one of the configurations depicted
in Fig. 3(a,b) occurs. However, it is clear that
$w(e_{v,x}) + w(e_{x,y}) + w(e_{x,u}) \ge 2$ and hence $w(T') \ge 2$.

Lastly, we observe the case when vertex $x$ is in cell $(i+1, j+1)$. Then
$w(P \cup Q)$ is at least $w(e_{x,v}) + w(e_{x,u}) + w(e_{x,y})$, which is
minimized when $x$ is the Fermat point for the three corners of cell
$(i+1, j+1)$ and $T'$ has the structure depicted in Fig. 3(c). Therefore it
can be computed that
$w(T') \ge \frac{1}{3}\left(2\sqrt{6} + \sqrt{6 - 3\sqrt{3}}\right) > 1.93$. □
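The constant in Lemma 2 can be checked numerically. The sketch below assumes that the three relevant corners of the middle cell form a right isosceles triangle with unit legs, placed at (0,0), (1,0) and (0,1), and approximates the Fermat point with Weiszfeld's iteration (a standard method for the geometric median, used here only as an independent check; it is not part of the paper).

```python
import math

def fermat_sum(points, iterations=500):
    """Approximate the minimum total distance from one point to all given
    points, using Weiszfeld's iteration started at the centroid."""
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = den = 0.0
        for px, py in points:
            d = max(math.hypot(x - px, y - py), 1e-12)  # avoid division by zero
            num_x += px / d
            num_y += py / d
            den += 1.0 / d
        x, y = num_x / den, num_y / den
    return sum(math.hypot(x - px, y - py) for px, py in points)

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # assumed corner placement
numeric = fermat_sum(corners)
closed_form = (2 * math.sqrt(6) + math.sqrt(6 - 3 * math.sqrt(3))) / 3
```

Both values agree with $\sqrt{2 + \sqrt{3}} = (\sqrt{6} + \sqrt{2})/2$, which is an equivalent way of writing the bound.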


Lemma 3. The weight of any subtree of Topt with eight edges is at least 2. 
Fig. 3. Layouts of $P$ and $Q$: panels (a), (b), (c)


Proof. Let $T'$ be a subtree of $T_{opt}$ with eight edges. If $T'$ does not
fit in any 3 x 3 sub-grid then by the same simple argument as in the proof of
Lemma 2 we get $w(T') \ge 2$. If $T'$ fits in a 3 x 3 sub-grid, then there is
one vertex of $T'$ in every cell of such a 3 x 3 grid. More specifically,
there are vertices in cells $(i,j)$, $(i+2, j)$, $(i, j+2)$ and $(i+2, j+2)$,
from which it easily follows that $w(T') \ge 2$. □

Lemma 4. The weight of any subtree of $T_{opt}$ with nine edges is at least
$1 + \sqrt{3}$.

Proof. Let $T'$ be a subtree of $T_{opt}$ with nine edges. If $T'$ does not
fit in any 4 x 4 sub-grid of the original grid, then there are two vertices
$u, v$ of $T'$ which are in cells with coordinates $(i,j)$ and $(i',j')$ such
that $|i - i'| \ge 4$ or $|j - j'| \ge 4$. In that case $w(e_{u,v}) \ge 3$ and
therefore $w(T') \ge 3 > 1 + \sqrt{3}$.

Next we consider the case when the smallest rectangular sub-grid that
contains $T'$ is of the size 4 x 4, and let $(i,j)$ be the bottom left corner
cell of such a 4 x 4 grid. In that case there are four (not necessarily
distinct) vertices $u, v, x, y$ of $T'$ that, for some
$i \le i', i'' \le i+3$ and $j \le j', j'' \le j+3$, lie in cells $(i', j)$,
$(i, j')$, $(i'', j+3)$, $(i+3, j'')$, respectively. Let $P$ be the shortest
path in $T'$ from $u$ to $x$ and let $Q$ be the shortest path in $T'$ from $v$
to $y$. Let us observe the union of paths $P$ and $Q$. This union is a set of
$k$ edges we denote by $e_\ell$, $\ell = 1, \ldots, k$. Let us denote by
$x_\ell$ and $y_\ell$ the lengths of the projections of $e_\ell$ on the
x-axis and the y-axis, respectively. Then

$$w(P \cup Q) = \sum_{\ell=1}^{k} \sqrt{x_\ell^2 + y_\ell^2}. \qquad (5)$$

Since the distance between the projections of $u$ and $x$ on the x-axis is at
least 2 and the distance between the projections of $v$ and $y$ on the y-axis
is at least 2, it follows that $\sum_\ell x_\ell \ge 2$ and
$\sum_\ell y_\ell \ge 2$. Hence, (5) is minimized when $k = 1$ and
$x_1 = y_1 = 2$, with the minimal value being $2\sqrt{2}$. Therefore we get
$w(T') \ge 2\sqrt{2} > 1 + \sqrt{3}$.

Lastly, we consider the case when $T'$ fits into a rectangular sub-grid $R$
of dimensions smaller than 4 x 4. Without loss of generality we can assume
that $R$ is of the size 4 x 3, and let $(i,j)$ be the bottom left corner cell
of $R$. Note that there are at least two vertices of $T'$ that are in corner
cells of $R$. Without loss of generality we assume that vertex $v$ is in cell
$(i,j)$. Next we distinguish the remaining cases with respect to the position
of the second corner point, which we denote by $u$.
Case 1. Vertex $u$ is in cell $(i, j+2)$. As there are ten vertices in $T'$,
one of them must be in cell $(i+3, j')$ for some $j \le j' < j+3$. Denote
such a vertex by $y$. By calculating the Fermat point $x$ it can be seen that
the weight of the Steiner tree containing $u$, $v$ and $y$ is at least
$2 + \sqrt{3}/2$, which is greater than $1 + \sqrt{3}$, see Fig. 4(a).



Fig. 4. Configurations of $T'$ in Cases 1-3: panels (a), (b), (c)


Case 2. Vertex $u$ is in cell $(i+3, j)$. We can assume that there are no
vertices of $T'$ in cells $(i, j+2)$ or $(i+3, j+2)$, as then Case 1 applies.
Then there must be vertices $y', y''$ in $T'$ in cells $(i+1, j+2)$ and
$(i+2, j+2)$. Hence, $w(T')$ must be at least the weight of the Steiner tree
that contains the right upper corner of cell $(i,j)$, the right bottom corner
of cell $(i+3, j)$ and the left bottom corner of cell $(i+2, j+2)$. By
calculating the Fermat point, one can see that such a Steiner tree has weight
$1 + \sqrt{3}$, hence $w(T') \ge 1 + \sqrt{3}$. In Fig. 4(b) subtree $T'$ has
the configuration that mimics such a Steiner tree.

Case 3. Vertex $u$ is in cell $(i+3, j+2)$. We can assume that there are no
vertices of $T'$ in cells $(i, j+2)$ or $(i+3, j)$, as then Case 1 or Case 2
applies. In this case the minimal weight $T'$ mimics the Steiner tree that
contains the right upper corner of cell $(i,j)$, the left bottom corner of
cell $(i+3, j+2)$, the right bottom corner of cell $(i+2, j)$ and the left
upper corner of cell $(i+1, j+2)$, see Fig. 4(c). It is easy to calculate
that the weight of such a Steiner tree is $\sqrt{5 + 2\sqrt{3}}$, which is
greater than $1 + \sqrt{3}$. □

Now we are ready to prove Theorem 2.

Proof (of Theorem 2). We will prove the theorem by induction on $N$. Recall
that $N$ is the number of edges in $T_{opt}$.

By Lemma 1, Lemma 3 and Lemma 4, the theorem holds for $N \le 13$. Next we
assume that the theorem holds for all trees with number of edges strictly
less than $N$.

We will perform the induction step as follows: through an exhaustive case
study we will show that there always exists a subtree $T'$ of $T_{opt}$ for
which $w(T')$ is greater than or equal to the number of edges of $T'$ divided
by 4, and such that $T_{opt}$ remains connected if we remove the edges of
$T'$ from it. In that case, by the induction hypothesis, the bound for
$T_{opt}$ holds.
We observe $T_{opt}$ as a rooted tree, and given a vertex $v$ of $T_{opt}$, we
denote by $T_v$ the maximal subtree of $T_{opt}$ rooted at $v$.

Let $u$ be a non-leaf vertex of $T_{opt}$ with the maximum number of edges in
its path to the root.

Assumption 1: We may assume $u$ has at most two children. Namely, in the
case when $u$ has four children $u_1, u_2, u_3, u_4$, let $T'$ be the subtree
of $T_u$ induced by $\{u, u_1, u_2, u_3, u_4\}$. In the case when $u$ has
exactly three children $u_1, u_2, u_3$, set $T'$ to be
$T_u \cup \{e_{u,v}\}$ where $v$ is the parent of $u$. Note that in both cases
$T'$ has four edges. Let $T'' = T_{opt} \setminus E(T')$ where $E(T)$ denotes
the set of edges of a tree $T$. Since $T''$ is a tree, by the induction
hypothesis it follows that $|E(T'')| = N - 4 \le 4w(T'') + 3$. Furthermore,
by Lemma 1 we have that $4 \le 4w(T')$. Hence,
$N \le 4w(T'') + 4w(T') + 3 = 4w(T_{opt}) + 3$.

Assumption 2: If $u$ has exactly two children $u_1, u_2$, we may assume that
the parent of $u$ (denoted by $v$) has degree strictly greater than two.
Namely, if this is not the case, we set $T' = T_v \cup \{e_{v,w}\}$ where $w$
is the parent of $v$, and we set $T'' = T_{opt} \setminus E(T')$. Since $T'$
has four edges and $T''$ is a tree, by the induction hypothesis for $T''$ and
Lemma 1 we obtain the bound.

Case 1: Vertex $u$ has exactly two children $u_1, u_2$. Then by Assumption 2
$v$ has at least two children. By the choice of $u$, the number of edges in
any path from $v$ to a leaf in $T_v$ is at most 2. Let $w'$ be another child
of $v$. By Assumption 1 $w'$ has at most two children. Also note that we can
assume that $w'$ has at least one child. Otherwise the subtree $T'$ induced
by $\{w', v, u, u_1, u_2\}$ has four edges, hence by removing the edges of
$T'$ from $T_{opt}$ we can apply the induction hypothesis and obtain the
bound.

Case 1.1: Vertex $v$ has another child $w''$. In this case, using the same
arguments as above, it can be shown that $w''$ must have exactly one or two
children. Note that the subtree $T'$ induced by $v, u, u_1, u_2$ together
with $T_{w'}$, $T_{w''}$ has at least seven edges and at most nine edges.
Therefore, Lemma 2, Lemma 3 or Lemma 4 can be applied for each of the cases.
Furthermore, for the remaining subtree $T_{opt} \setminus E(T')$ the
induction hypothesis can be applied to obtain the bound.

Case 1.2: Vertex $v$ has only two children $w', u$. Let $w$ be the parent of
$v$. We can assume that $w'$ has exactly one child, otherwise the subtree
$T'$ induced by the vertices of $T_v$ and vertex $w$ has exactly seven edges,
hence we could use Lemma 2. If the degree of $w$ is two, then let $T'$ be the
subtree induced by $T_w$ together with the edge $e_{w,y}$ where $y$ is the
parent of $w$. $T'$ has seven edges and therefore the result follows. Now, we
may assume that $w$ has another child $v'$. Let $T_1 = T_v$ and observe that
$T_1$ has 5 edges. Let $T_2 = T_{v'}$. By the same argument used for $T_v$,
we conclude that $T_2$ has at most five edges. Let
$T' = T_1 \cup T_2 \cup \{e_{w,v'}, e_{w,v}\}$. If $T_2$ has zero, one or two
edges, then $T'$ has at least seven and at most nine edges, and hence the
bound follows. If $T_2$ has four edges then by the induction hypothesis on
$T_{opt} \setminus E(T_2)$ and by applying Lemma 1 on $T_2$, we obtain the
bound. It remains to consider the cases when $T_2$ has three or five edges.
If $T_2$ has three edges, then we add edge $e_{w,v'}$ to $T_2$ and now the new
tree has four edges, hence we can apply the same arguments as before. We are
left only with the case when $T_2$ has five edges. In this case
$w(T_2) \ge 1$, according to Lemma 1, and also
$T_3 = T_1 \cup \{e_{w,v}, e_{w,v'}\}$ has seven edges. By Lemma 2 either
$w(T_3)$ is at least 2, or it has the structure depicted in Fig. 3(c), and it
is clear that every edge incident to the tree in Fig. 3(c) is greater than,
say, 0.5. Hence, in either case $w(T') \ge 3$. Since $T'$ has twelve edges,
the bound is obtained by the induction hypothesis on
$T_{opt} \setminus E(T')$.

Case 2: Vertex $u$ has exactly one child $u_1$.

Case 2.1: Vertex $v$ has another child $w'$. In this case $T_{w'}$ has depth
at most 1. If $w'$ has more than one child, then from Case 1 ($w'$ instead of
$u$) we are done. If $w'$ has one child (denoted by $w_1$), then the subtree
induced by $\{u_1, u, v, w', w_1\}$ has four edges and we are done.

We continue by assuming that $w'$ has no child. If $v$ has another child
$w'' \notin \{u, w'\}$, then as we argued for $w'$, we can assume that $w''$
has no child. However, in this case the subtree induced by
$\{u_1, u, v, w', w''\}$ has four edges and we are done. Therefore we can
assume that $v$ has exactly two children $w'$ and $u$. Let $w$ be the parent
of $v$. Then the subtree induced by $\{u_1, u, v, w', w\}$ has four edges and
we are done.

Case 2.2: Vertex $v$ has only one child $u$. Let $w$ be the parent of $v$. We
can assume that $v$ has a sibling node $v'$, as otherwise we can remove the
four edge subtree induced by $\{u_1, u, v, w, z\}$, where $z$ is the parent
of $w$. Furthermore, we can assume that $v'$ has a child $u'$, as otherwise
we can remove the four edge subtree induced by $\{u_1, u, v, w, v'\}$.

Case 2.2.1: Vertex $u'$ has no child but has a sibling $u''$. We can assume
that no child of $v'$ has a child, as we can observe such a case as an
instance of Case 2.2.3. Furthermore, we can assume that $u'$ and $u''$ are
the only children of $v'$. Otherwise, in the case when $v'$ has more than
three children, there would exist a subtree of $T_{v'}$ with four edges that
we could remove. Furthermore, in the case when $v'$ has exactly three
children, we can remove $T' = T_{v'} \cup \{e_{w,v'}\}$.

Hence we are left with the case when $u''$ is the only sibling of $u'$. In
the case when $v$ and $v'$ are the only children of $w$, we can remove the
seven edge subtree $T' = T_w \cup \{e_{w,z}\}$, where $z$ denotes the parent
of $w$. Lastly, we consider the case when there exists a third child of $w$,
denoted by $v''$. From the assumptions and solved cases above, we can assume
that $T_{v''}$ has at most two edges, hence the subtree
$T' = T_v \cup T_{v'} \cup T_{v''} \cup \{e_{w,v}, e_{w,v'}, e_{w,v''}\}$ has
seven, eight or nine edges, therefore we can remove it.

Case 2.2.2: Vertex $u'$ has neither a child nor a sibling. In the case there
exists a third child of $w$, from the assumptions and solved cases above it
would follow that we can assume that it has only one child, which has no
child. In that case there would exist a subtree of $T_w$ with four edges that
we can remove. Hence, we can assume that $w$ has no other children besides
$v$ and $v'$. Then $T_w$ is a path with five edges. If $w(T_w)$ is greater
than 5/4, we can remove it and we are done. Otherwise it must be similar to
the structure depicted in Fig. 5, i.e. with a path of approximate size 1
alongside a border of a cell, and with the remaining vertices grouped at the
endpoints of such a path. Note that in that case, edge $e_{w,z}$ must be big
enough so that $w(T_w \cup \{e_{w,z}\})$ is greater than 6/4. Hence we can
remove $T_w \cup \{e_{w,z}\}$ and by the induction hypothesis obtain the
bound.


Approximation Algorithms for Generalized MST and TSP in Grid Clusters 





Fig. 5. A short path with five edges 

Case 2.2.3: Vertex $u'$ has a child $u'_1$. Note that from the assumption on
maximality of the depth of $u$, $u'_1$ has no children. As we solved
Case 2.1, we can assume that $u'_1$ has no siblings. Furthermore, we can
assume that there is no sibling of $u'$ that has a child, as in that case
there would exist a subtree of $T_{v'}$ with four edges that we could remove.
Now in the case that $u'$ has more than one sibling, again, there would exist
a subtree of $T_{v'}$ with four edges that we could remove. In the case that
$u'$ has exactly one sibling, the subtree $T' = T_{v'} \cup \{e_{w,v'}\}$ can
be removed. We are left with the case when both $T_v$ and $T_{v'}$ are paths
with two edges. In the case there is a third child of $w$, denoted by $v''$,
from the solved cases above it follows that we can assume that $T_{v''}$ is
also a path with two edges. In that case there is a subtree of $T_w$ with
nine edges that can be removed. In the case there is no third child of $w$,
the seven edge subtree $T' = T_w \cup \{e_{w,z}\}$ (with $z$ being the parent
of $w$) can be removed and the bound obtained.

We considered all the cases, therefore proving the theorem. □ 

4 Approximation of the GGTSP 

Our approximation algorithms for the GGMST can be used to obtain
approximation algorithms for the geometric generalized travelling salesman
problem on grid clusters (GGTSP) using standard methods.

We start with the approach of shortcutting a double MST, presented in
Algorithm 4 and analyzed next.


Algorithm 4: $(2 + 8\sqrt{2} + 2\epsilon)$-approximation algorithm for the GGTSP
Data: Instance I of the GGTSP
Result: Generalized travelling salesman tour

1 T_A <- output of Algorithm 3 on I;
2 G_E <- Eulerian graph obtained by doubling all edges in T_A;
3 E_T <- an Euler tour of G_E;
4 C <- a GGTSP tour obtained by going along E_T and skipping repeated vertices;
5 return C;


By removing one edge from a GGTSP tour, one obtains a GGMST tree, hence
$w(T_A)$ is less than $(1 + 4\sqrt{2} + \epsilon)OPT$, where $OPT$ is the
weight of an optimal solution of the GGTSP. Therefore, $w(G_E)$ is less than
$2(1 + 4\sqrt{2} + \epsilon)OPT$. Due to the triangle inequality, shortcutting
the Euler tour in line 4 of the algorithm does not increase the weight. Hence,
Algorithm 4 is a $(2 + 8\sqrt{2} + 2\epsilon)$-approximation algorithm for the
GGTSP. Note that $2 + 8\sqrt{2}$ is approximately equal to 13.31.
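Lines 2 to 4 of Algorithm 4 can be sketched in Python as follows. The sketch relies on a standard observation: the order in which a depth-first traversal of the tree first visits its vertices is exactly what remains of an Euler tour of the doubled tree after skipping repeated vertices. The function names and the edge-list input format are assumptions for illustration.

```python
import math
from collections import defaultdict

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def shortcut_doubled_tree(tree_edges):
    """tree_edges: non-empty list of (p, q) point pairs forming a tree.
    Returns the tour as a list of points (the closing edge back to the
    first point is implicit)."""
    adj = defaultdict(list)
    for p, q in tree_edges:
        adj[p].append(q)
        adj[q].append(p)
    tour, seen, stack = [], set(), [tree_edges[0][0]]
    while stack:
        v = stack.pop()
        if v in seen:
            continue  # repeated vertex of the Euler tour: skip it
        seen.add(v)
        tour.append(v)
        stack.extend(reversed(adj[v]))
    return tour

def tour_weight(tour):
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))
```

By the triangle inequality the resulting tour weighs at most twice the tree, which is the factor 2 in the analysis above.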
Next we use the approach from the famous Christofides
$\frac{3}{2}$-approximation algorithm for the metric TSP, see [3]. This
approach will give us a decrease of 0.5 in the approximation ratio. We give a
sketch of the algorithm and the analysis, and leave the details to the reader.

We start by running Algorithm 1 on the GGTSP instance. Let $T_0$ be the
resulting tree. Note that $w(T_0)$ is less than or equal to
$(1 + 4\sqrt{2})OPT + 2\sqrt{2}$, where $OPT$ is the weight of an optimal
solution of the GGTSP. Let $S$ be the set of non-empty cells that contain a
vertex of $T_0$ with an odd degree. Note that $|S|$ is even. Let $M$ be a
minimum perfect matching among cells in $S$, where the distance between two
cells $C_1, C_2 \in S$ is the smallest distance between two points
$p_1, p_2$ among all $p_1 \in C_1$, $p_2 \in C_2$. It is not hard to show
that $w(M) \le \frac{1}{2}OPT$. Let $M_0$ be the set of edges $e_{t_1,t_2}$
for which $t_1, t_2$ are vertices of $T_0$ and there exists an edge
$e_{p_1,p_2} \in M$ such that $p_1$ and $t_1$ are in the same cell and $p_2$
and $t_2$ are in the same cell. Note that
$w(M_0) \le \frac{1}{2}OPT + N\sqrt{2}$, and hence by Theorem 2 we get that
$w(M_0) \le \frac{1}{2}OPT + 4\sqrt{2}\,OPT + 3\sqrt{2}$. By merging $M_0$
and $T_0$ we obtain an Eulerian graph, and by shortcutting one of its Euler
tours we obtain a GGTSP tour with weight at most
$(\frac{3}{2} + 8\sqrt{2})OPT + 5\sqrt{2}$. By a similar approach as in
Algorithm 3 and Theorem 3 we can get rid of the $5\sqrt{2}$ error, and obtain
a $(\frac{3}{2} + 8\sqrt{2} + \epsilon)$-approximation algorithm for every
$\epsilon > 0$.

5 Conclusions 

We presented a simple $(1 + 4\sqrt{2} + \epsilon)$-approximation algorithm
for the geometric generalized minimum spanning tree problem on grid clusters
(GGMST) and a $(1.5 + 8\sqrt{2} + \epsilon)$-approximation algorithm for the
geometric generalized travelling salesman problem on grid clusters (GGTSP).

To obtain guaranteed approximation ratios for our algorithms, we used the
following lower bound on the optimal solution: every tree with $N$ edges that
contains at most one point from any 1 x 1 grid cell is of size at least
$(N - 3)/4$. Obtaining a tight lower bound in terms of the number of edges
would decrease the guaranteed approximation ratios of our (and other similar)
algorithms. Moreover, it would be an interesting result on its own.

Acknowledgment: We would like to thank Geoffrey Exoo for many useful
discussions.